Abstract

This study is a reproduction of research by Chakraborty (2021), who analyzes county-level data to identify relationships between disability rates and COVID-19 incidence rates. The original study uses bivariate Pearson product-moment correlation and generalized estimating equations to test these relationships across 18 socio-demographic categories. Its purpose is to assess whether people with disabilities face disproportionate outcomes from COVID-19.

This study is a reproduction of:

Chakraborty, J. 2021. Social inequities in the distribution of COVID-19: An intra-categorical analysis of people with disabilities in the U.S. Disability and Health Journal 14:1-5. https://doi.org/10.1016/j.dhjo.2020.101007

Study metadata

Original study spatio-temporal metadata

  • Spatial Coverage: contiguous United States (lower 48 states and Washington D.C.)
  • Spatial Resolution: county
  • Spatial Reference System: spatial reference system of original study
  • Temporal Coverage: 2020-01-22 to 2020-08-01
  • Temporal Resolution: temporal resolution of original study

Study design

The aim of this reproduction study is to implement the original study as closely as possible in order to reproduce the map of the county-level distribution of COVID-19 incidence rates, the summary statistics, and the bivariate correlations between disability characteristics and COVID-19 incidence.

Materials and procedure

Computational environment

Data and variables

American Community Survey (ACS) five-year estimate (2014-2018)

  • Title: American Community Survey (ACS) five-year estimate (2014-2018)
  • Abstract: Sociodemographic breakdown of disabled population
  • Spatial Coverage: United States
  • Spatial Resolution: County
  • Spatial Representation Type: Vector MULTIPOLYGON
  • Spatial Reference System: EPSG 4269
  • Temporal Coverage: 2014-2018
  • Temporal Resolution: five-year estimate
  • Lineage: Pulled documented variables from the S1810 and C18130 tables using tidycensus
  • Distribution: Publicly Available
  • Constraints: Public Data

The American Community Survey (ACS) five-year estimate (2014-2018) variables used in the study are outlined in the table below. Details on ACS data collection can be found at https://www.census.gov/topics/health/disability/guidance/data-collection-acs.html and details on sampling methods and accuracy can be found at https://www.census.gov/programs-surveys/acs/technical-documentation/code-lists.html.

Disability Subgroup Variables
Variable Name in Study | ACS Variable Name
percent of total civilian non-institutionalized population with a disability | S1810_C03_001E
Race
percent with disability: White alone | S1810_C03_004E
percent with disability: Black alone | S1810_C03_005E
percent with disability: Native American | S1810_C03_006E
percent with disability: Asian alone | S1810_C03_007E
percent with disability: Other race | S1810_C03_009E
Ethnicity
percent with disability: Non-Hispanic White | S1810_C03_011E
percent with disability: Hispanic | S1810_C03_012E
percent with disability: Non-Hispanic non-White | (S1810_C02_001E - S1810_C02_011E - S1810_C02_012E) / (S1810_C01_001E - S1810_C01_011E - S1810_C01_012E) * 100
Poverty
percent with disability: Below poverty level | (C18130_004E + C18130_011E + C18130_018E) / C18130_001E * 100
percent with disability: Above poverty level | (C18130_005E + C18130_012E + C18130_019E) / C18130_001E * 100
Age
percent with disability: 5-17 | S1810_C03_014E
percent with disability: 18-34 | S1810_C03_015E
percent with disability: 35-64 | S1810_C03_016E
percent with disability: 65-74 | S1810_C03_017E
percent with disability: 75+ | S1810_C03_018E
Biological sex
percent with disability: male | S1810_C03_002E
percent with disability: female | S1810_C03_003E
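The two derived rows in the table (non-Hispanic non-White and poverty status) are simple arithmetic on ACS counts. The sketch below writes them out in Python rather than in the study's R workflow. The function names and the toy counts are hypothetical; only the ACS variable codes and the formulas are taken from the table.

```python
def non_hisp_non_white_pct(acs):
    """Percent with a disability among people who are neither
    non-Hispanic White nor Hispanic (formula from the table)."""
    with_disability = (acs["S1810_C02_001E"]
                       - acs["S1810_C02_011E"]
                       - acs["S1810_C02_012E"])
    group_total = (acs["S1810_C01_001E"]
                   - acs["S1810_C01_011E"]
                   - acs["S1810_C01_012E"])
    return with_disability / group_total * 100


def poverty_pct(acs, cells):
    """Sum the listed C18130 cells and express them as a percent of C18130_001E."""
    return sum(acs[c] for c in cells) / acs["C18130_001E"] * 100


BELOW_POVERTY = ["C18130_004E", "C18130_011E", "C18130_018E"]
ABOVE_POVERTY = ["C18130_005E", "C18130_012E", "C18130_019E"]

# Toy counts for one made-up county (NOT real ACS values).
county = {
    "S1810_C01_001E": 1000, "S1810_C01_011E": 600, "S1810_C01_012E": 200,
    "S1810_C02_001E": 150, "S1810_C02_011E": 80, "S1810_C02_012E": 20,
    "C18130_001E": 800,
    "C18130_004E": 20, "C18130_011E": 100, "C18130_018E": 80,
    "C18130_005E": 40, "C18130_012E": 200, "C18130_019E": 160,
}

nhnw = non_hisp_non_white_pct(county)
bpov = poverty_pct(county, BELOW_POVERTY)
apov = poverty_pct(county, ABOVE_POVERTY)
print(nhnw, bpov, apov)
```

Passing the cell list as an argument keeps the below-poverty and above-poverty calculations in one function, so the two rows of the table cannot drift apart.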

American Community Survey (ACS) data for sociodemographic subcategories of people with disabilities can be accessed by using the tidycensus package to query the Census API. This requires an API key which can be acquired at api.census.gov/data/key_signup.html.
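For readers who want to see what such a query looks like without tidycensus, the following Python sketch only builds a request URL for the Census API; it does not send it. This is an illustration, not the study's method: the helper name `build_acs_query` and the placeholder key are hypothetical, and the endpoint shown is the one for 2014-2018 five-year subject tables such as S1810 (detailed tables such as C18130 are served from a separate endpoint).

```python
from urllib.parse import urlencode

# Endpoint for 2014-2018 ACS five-year *subject* tables (S-prefixed tables).
BASE_URL = "https://api.census.gov/data/2018/acs/acs5/subject"


def build_acs_query(variables, api_key):
    """Return a county-level query URL for the given ACS variable codes."""
    params = {
        "get": ",".join(["NAME"] + list(variables)),
        "for": "county:*",   # every county
        "key": api_key,      # obtained from the key signup page
    }
    # Keep commas, colons and asterisks unescaped so the URL stays readable.
    return BASE_URL + "?" + urlencode(params, safe=",:*")


url = build_acs_query(["S1810_C03_001E"], "YOUR_KEY")
print(url)
```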

County Level COVID-19 Incidence Rate

  • Title: County Level COVID-19 Incidence Rate
  • Abstract: Cumulative confirmed COVID-19 cases and total population estimates by county
  • Spatial Coverage: United States
  • Spatial Resolution: County
  • Spatial Representation Type: Vector MULTIPOLYGON
  • Spatial Reference System: EPSG 4269
  • Temporal Coverage: 2020-01-22 — 2020-08-01
  • Temporal Resolution: 8 months
  • Lineage: Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, downloaded August 1, 2020
  • Distribution: see below
  • Constraints: Public Data

Data on COVID-19 cases from the Johns Hopkins University dashboard are provided directly with the research compendium because the data are no longer available online in the state in which they were downloaded on August 1, 2020. The dashboard and its cumulative counts of COVID-19 cases and deaths were continually updated, so an exact reproduction required asking the original author, Jayajit Chakraborty, to provide the data as of August 1, 2020. The data include an estimate of the total population (POP_ESTIMA) and confirmed COVID-19 cases (Confirmed). The case data express the cumulative count of reported COVID-19 cases from 1/22/2020 to 8/1/2020. Although metadata for this particular resource are no longer available from the original source, one can reasonably assume that the total population estimate was based on the 2014-2018 five-year ACS estimate, as the 2019 estimates had not yet been released.

Versions of the data can be found at the Johns Hopkins CSSE COVID-19 Data Repository (https://github.com/CSSEGISandData/COVID-19). However, the archived data only provide summaries at the national scale. We received the county-level COVID-19 case data through 8/1/2020 from the author, as there is no readily apparent way to access archived data from the Johns Hopkins University Center for Systems Science and Engineering database.

Data transformations

Workflow

ACS data transformations

The original study extent is the lower 48 states and Washington D.C. Therefore, Alaska, Hawai’i and Puerto Rico are removed from the data (workflow step 1). Data on people with disabilities in poverty are derived from a different census table (C18130) than data on people with disabilities by age, race, ethnicity, and biological sex (S1810). Therefore, join the poverty data to the other data using the GEOID (workflow step 3). Also transform the ACS geographic data into the Contiguous USA Albers Equal Area projection and fix geometry errors.
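The filtering and joining steps can be illustrated with plain Python records. This is a minimal sketch, not the study's R code: it assumes each record carries a five-digit county `GEOID` whose first two digits are the state FIPS code (02 for Alaska, 15 for Hawai’i, 72 for Puerto Rico). The helper names and the toy values are hypothetical, and the projection and geometry repair are not shown.

```python
EXCLUDED_STATE_FIPS = {"02", "15", "72"}  # Alaska, Hawai'i, Puerto Rico


def contiguous_only(rows):
    """Keep counties in the lower 48 states and Washington D.C."""
    return [r for r in rows if r["GEOID"][:2] not in EXCLUDED_STATE_FIPS]


def join_on_geoid(left, right):
    """Left join: add the fields of the matching `right` record to each `left` record."""
    right_by_id = {r["GEOID"]: r for r in right}
    return [{**l, **right_by_id.get(l["GEOID"], {})} for l in left]


# Toy records (made-up counts) standing in for the S1810 and C18130 tables.
s1810 = [
    {"GEOID": "35039", "S1810_C01_001E": 1000},
    {"GEOID": "02013", "S1810_C01_001E": 500},
    {"GEOID": "72001", "S1810_C01_001E": 700},
]
c18130 = [{"GEOID": "35039", "C18130_001E": 800}]

acs = join_on_geoid(contiguous_only(s1810), c18130)
print(acs)
```

A left join keeps every county from the S1810 table even when the poverty table has no matching record, which makes any missing poverty data visible afterwards instead of silently dropping the county.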

Optionally, save the raw ACS data to data/raw/public/acs.gpkg for use in GIS software.

Calculate independent socio-demographic variables of people with disabilities as percentages for each sub-category of disability (race, ethnicity, poverty, age, and biological sex) and remove raw census data from the data frame (workflow step 4). Reproject the data into an Albers equal area conic projection.

COVID-19 data transformations

Calculate the COVID-19 incidence rate as cumulative confirmed cases per 100,000 people (workflow step 2). Convert the COVID-19 data to a non-geographic data frame.
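As a worked example of the incidence rate calculation, the Python sketch below uses the case count and population reported for Rio Arriba County, New Mexico, in the missing-data record of this report. The function name is hypothetical; in the provided data the inputs correspond to the Confirmed and POP_ESTIMA fields.

```python
def incidence_rate(cases, population):
    """Cumulative confirmed cases per 100,000 people."""
    return cases / population * 100_000


# Rio Arriba County, New Mexico: 293 confirmed cases, population 39,006.
rate = incidence_rate(cases=293, population=39006)
print(round(rate, 2))  # 751.17
```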

Join dependent COVID data to independent ACS demographic data.

Missing data

Unplanned deviation for reproduction: There is one county with missing disability and poverty data. This was not mentioned in the original study or in our pre-analysis plan. However, we replace the missing data with zeros, producing results identical to Chakraborty’s.

fips | statefp | county | county_st | covid_rate | dis_pct | white_pct | black_pct | native_pct | asian_pct | other_pct | non_hisp_white_pct | hisp_pct | non_hisp_non_white_pct | bpov_pct | apov_pct | pct_5_17 | pct_18_34 | pct_35_64 | pct_65_74 | pct_75 | male_pct | female_pct | pop | cases | x | y
35039 | 35 | Rio Arriba | Rio Arriba County, New Mexico | 751.17 | 16.06467 | 10.77458 | 0.038371 | 2.744807 | 0.038371 | 2.468536 | 2.355981 | 11.39619 | 2.312494 | NA | NA | 0.3069682 | 1.258569 | 6.781439 | 3.391998 | 4.279648 | 8.556738 | 7.50793 | 39006 | 293 | -106.6932 | 36.50962
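The zero replacement can be sketched as follows in Python, with `None` standing in for R's `NA`. The helper name is hypothetical and only a few fields of the record are reproduced; this illustrates the idea and is not the study's code.

```python
def fill_missing_with_zero(row, fields):
    """Return a copy of `row` with missing values in `fields` replaced by 0.0."""
    return {k: (0.0 if k in fields and v is None else v) for k, v in row.items()}


# Abbreviated record for Rio Arriba County; None marks the missing poverty values.
rio_arriba = {
    "fips": "35039",
    "covid_rate": 751.17,
    "dis_pct": 16.06467,
    "bpov_pct": None,
    "apov_pct": None,
}

filled = fill_missing_with_zero(rio_arriba, fields={"bpov_pct", "apov_pct"})
print(filled)
```

Restricting the replacement to named fields avoids masking an unexpected gap in any other column.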

Map COVID-19 incidence

Map the county level distribution of COVID-19 incidence rates, comparing to Figure 1 of the original study.

Map disability rates

Unplanned deviation for reproduction: We also map the spatial distribution of the percent of people with any disability to improve our understanding of the geographic patterns and relationships between the overarching independent variable (percentage of people with disability) and the dependent variable (COVID-19 incidence rate).

Descriptive statistics

Calculate descriptive statistics for the dependent COVID-19 incidence rate and the independent socio-demographic characteristics, reproducing the min, max, mean, and SD columns of Table 1 of the original study.

Planned deviation for reanalysis: We also calculate the Shapiro-Wilk test for normality.
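The four summary columns can be computed for any one variable as in the Python sketch below. It assumes the sample standard deviation (n - 1 denominator), which is what R's `sd()` returns; the function name and the toy values are hypothetical. The Shapiro-Wilk test is not part of Python's standard library and is left out here (SciPy offers it as `scipy.stats.shapiro`).

```python
import statistics


def describe(values):
    """Min, max, mean and sample standard deviation of one variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "SD": statistics.stdev(values),  # n - 1 denominator
    }


# Toy percentages for four made-up counties (NOT study data).
summary = describe([2.0, 4.0, 4.0, 6.0])
print({k: round(v, 2) for k, v in summary.items()})
```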

Reproduced Descriptive Statistics
variable min max mean SD ShapiroWilk p
covid_rate 0.00 14257.17 966.90 1003.96 0.74 0
dis_pct 3.83 33.71 15.95 4.40 0.98 0
white_pct 0.85 33.26 13.55 4.63 0.98 0
black_pct 0.00 20.70 1.48 2.66 0.61 0
native_pct 0.00 13.74 0.28 0.94 0.28 0
asian_pct 0.00 3.45 0.09 0.18 0.51 0
other_pct 0.00 15.24 0.55 0.65 0.57 0
non_hisp_white_pct 0.10 33.16 12.84 4.81 0.99 0
hisp_pct 0.00 25.26 0.99 2.15 0.42 0
non_hisp_non_white_pct 0.00 20.93 2.13 2.75 0.70 0
bpov_pct 0.00 14.97 3.57 1.85 0.93 0
apov_pct 0.00 27.30 12.48 3.06 0.99 0
pct_5_17 0.00 5.08 1.03 0.48 0.95 0
pct_18_34 0.00 5.59 1.56 0.67 0.96 0
pct_35_64 1.01 18.36 6.35 2.30 0.96 0
pct_65_74 0.00 12.73 3.09 1.16 0.95 0
pct_75 0.00 11.13 3.87 1.19 0.97 0
male_pct 1.30 18.19 8.06 2.37 0.98 0
female_pct 1.91 19.94 7.90 2.26 0.98 0

Compare the reproduced descriptive statistics to the original descriptive statistics. The difference is calculated as ‘reproduction study - original study’, so identical results yield a difference of zero.
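The comparison is a cell-by-cell subtraction. The Python sketch below works through the COVID-19 incidence rate row only. The reproduced values come from the descriptive statistics table in this report; the original values are not quoted from the original paper but inferred here as the reproduced values minus the reported differences, so treat them as an assumption.

```python
# Reproduced covid_rate statistics, from the reproduced descriptive statistics table.
reproduced = {"min": 0.00, "max": 14257.17, "mean": 966.90, "SD": 1003.96}

# Original covid_rate statistics (inferred, whole numbers; an assumption).
original = {"min": 0, "max": 14257, "mean": 967, "SD": 1004}

# Difference = reproduction study - original study, rounded to two decimals.
diff = {k: round(reproduced[k] - original[k], 2) for k in reproduced}
print(diff)
```

Rounding the difference to two decimals removes floating-point noise, so a true match shows up as exactly zero.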

Descriptive Statistics Comparison
variable min max mean SD
covid_rate 0 0.17 -0.1 -0.04
dis_pct 0 0.00 0.0 0.00
white_pct 0 0.00 0.0 0.00
black_pct 0 0.00 0.0 0.00
native_pct 0 0.00 0.0 0.00
asian_pct 0 0.00 0.0 0.00
other_pct 0 0.00 0.0 0.00
non_hisp_white_pct 0 0.00 0.0 0.00
hisp_pct 0 0.00 0.0 0.00
non_hisp_non_white_pct 0 0.00 0.0 0.00
bpov_pct 0 0.00 0.0 0.00
apov_pct 0 0.00 0.0 0.00
pct_5_17 0 0.00 0.0 0.00
pct_18_34 0 0.00 0.0 0.00
pct_35_64 0 0.00 0.0 0.00
pct_65_74 0 0.00 0.0 0.00
pct_75 0 0.00 0.0 0.00
male_pct 0 0.00 0.0 0.00
female_pct 0 0.00 0.0 0.00

The descriptive statistics are identical, except that the original study seems to have rounded the COVID-19 statistics to zero decimal places.

Analysis

Workflow

Results


Discussion


Integrity Statement

The authors of this preregistration state that they completed this preregistration to the best of their knowledge and that no other preregistration exists pertaining to the same hypotheses and research.

Acknowledgements

This report is based upon the template for Reproducible and Replicable Research in Human-Environment and Geographical Sciences, DOI: https://doi.org/10.17605/OSF.IO/W29MQ

References